Computing Z-series of hyperelliptic curves 



Kiran S. Kcdlaya"^ and Andrew V. Sutherland 

Department of Mathematics 
Massachusetts Institute of Technology 
77 Massachusetts Avenue 
Cambridge, MA 02139 
(kedlay a I drew) Omath . mit . edu 



Abstract. We discuss the computation of coefficients of the L-series 
associated to a hyperelliptic curve over Q of genus at most 3, using point 
counting, generic group algorithms, and p-adic methods. 

1 Introduction 

For C a smooth projective curve of genus g defined over Q, the L-function 
Lie, s) is conjecturally (and provably for g ^ 1) an entire function containing 
much arithmetic information about C Most notably, according to the conjecture 
of Birch and Swinnerton-Dyer, the order of vanishing of L{C, s) at s = 1 equals 
the rank of the group J(C/Q) of rational points on the Jacobian of C . 

It is thus natural to ask to what extent we are able to compute with the 
L-function. This splits into two subproblems: 

1. For appropriate iV, compute the first N coefficients of the Dirichlet series 
expansion L(C, s) = Hp Lpip^^y^ = J2n=i Cnn~'. 

2. From the Dirichlet series, compute L{C, s) at various values of s to suitable 
numerical accuracy. (The Dirichlet series converges for Real(s) > 3/2.) 

In this paper, we address problem 1 for hyperelliptic curves of genus g < 3 with 
a distinguished rational Weierstrass point. This includes in particular the case 
of elliptic curves, and indeed we have something new to say in this case; we can 
handle significantly larger coefficient ranges than other existing implementations. 
We say nothing about problem 2; we refer instead to [5]. 

Our methods combine efficient point enumeration with generic group algo- 
rithms as discussed in the second author's PhD thesis [23]. For g > 2, we also 
apply p-adic cohomological methods, as introduced by the first author [12] and 
refined by Harvey [9]. Since what we need is adequately described in these papers, 
we focus our presentation on the point counting and generic group techniques 
and use an existing p-adic cohomological implementation provided by Harvey. 
(The asymptotically superior Schoof-Pila method [16, 15] only becomes practi- 
cally better far beyond the ranges we can hope to handle.) 
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As a sample application, we compare statistics for Frobenius eigenvalues of 
particular curves to theoretical predictions. These include the Sato- Tate conjec- 
ture for g = 1, and appropriate analogues in the Katz-Sarnak framework for 
g > 1', for the latter, we find little prior numerical evidence in the literature. 

2 The Problem 

Let C be a smooth projective curve over Q of genus g . We wish to determine the 
polynomial Lp{T) appearing in L{C, s) —JJ Lp{p~^)~^, for p < N. We consider 
only p for which C is defined and nonsingular over Fp (almost all of them), 
referring to [17, 4] in the case of bad reduction. The polynomial Lq{T) appears 
as the numerator of the local zcta function 

Z(C/F,; T) = exp N,T^ = _!;.fP_ ^^.y (1) 

where Nk counts the points on C over W^k . Here q is any prime power, however we 
are primarily concerned with q = p an odd prime. The rationality of Z{C/¥q] T) 
is part of the well known theorem of Weil [25], which also requires 

2s 

L,{T)=Y,a,T' (2) 

1=0 

to have integer coefficients satisfying oq = 1 and a2g-j = P^ ''o.ji for < j < g. 
To determine Lq{T), it suffices to compute ai, . . . , a^. 

For reasons of computational efficiency we restrict ourselves to curves which 
may be described by an afhne equation of the form y-^ = f{x), where f{x) is a 
monic polynomial of degree d = 2g + l (hyperelliptic curves with a distinguished 
rational Weierstrass point). We denote by J{C/¥q) the group of F^-rational 
points on the Jacobian variety of C over F^ (the Jacobian of C over F^), and 
use J{C/¥q) to denote the Jacobian of the quadratic twist of C over F^. 

We consider three approaches to determining Lp{T) for g < 3: 

1. Point counting: Compute iVi,. . . ,Ng of (1) by enumerating the points on C 
over Fp, Fp2 , . . . , Fps . The coefficients oi , . . . , can then be readily derived 
from (1) [3, p. 135]. This requires 0{p^) field operations. 

2. Group computation: Use generic algorithms to compute Lp{l) = ^J{C /¥p), 
and, for g > 1, compute ivp(— 1) = ^J{C /¥p). Then use Lp{l) and Lp(— 1) 
to determine Lp{T) [22, Lemma 4]. This involves a total of 0(p(29-i)/4) 
group operations. 

3. p-adic methods: Apply extensions of Kedlaya's algorithm [12, 9] to com- 
pute (modulo p) the characteristic polynomial x{T) — T~^^ Lp{T) of the 
Frobenius endomorphism on J(C/Fp), then use generic algorithms to com- 
pute the exact coefficients of Lp{T). The asymptotic complexity is 0{p^^^)} 

^ For fixed g > 4, one works modulo pLs/2-iJ Q^tain the same complexity. 



Computing the coefficients of Lp(T) for all p < necessarily requires time 
and space exponential in IgiV, since the output contains 0{N) bits. In practice, 
we are limited to N of moderate size: on the order of 2'^^ in genus 1, 2^^ in genus 
2, and 2^^ in genus 3 (larger in parallel computations). We expect to compute 
Lp{T) for a large number of relatively small values of p. Constant factors will 
have considerable impact, however we first consider the asymptotic situation. 

The 0{p^) complexity of point counting makes it an impractical method to 
compute ai, . . . , unless p is very small. However, point counting over ¥p is 
an efficient way to compute ai = Ni — p — 1 for a reasonably large range of 
p when 5 > 1, requiring only 0{p) field operations. Knowledge of ai aids the 
computation of ^J{C/¥p), reducing the complexity of the baby-steps giant-steps 
search to 0{p-^/^) in genus 2 and 0{p) in genus 3. The optimal strategy then 
varies (c.f. [6, pp. 32-33]), according to genus and range of p: 

Genus 1 The 0{p^^'^) complexity of generic group computation makes it the 
compelling choice, easily outperforming point counting for p > 2^*^. 

Genus 2 There are three alternatives: (i) 0{p) field operations followed by 
group operations, (ii) 0{p^^^) group operations, or (iii) an ©(p^/'^) p- 
adic computation. We find the range in which (iii) becomes optimal to be past 
the feasible values of N. 

Genus 3 The choice is between (i) 0{p) field operations followed by 0{p) group 
operations and (ii) an 0(p^/^) p-adic computation followed by 0{p^^^) group 
operations. Here the p-adic algorithm plays the major role once p > 2^^. 

3 Point Counting 

Counting points on C over Fp plays a key role in our strategy for genus 2 and 3 
curves. Moreover, it is a useful tool in its own right. If one wishes to study the 
distribution of #J(C/Fp) — Lp{l), or to simply estimate Lp{p^'^), the value oi 
may be all that is required. 

Given C in the form = /(a;), the simplest approach is to build a table of the 
quadratic residues in Fp (typically stored as a bit-vector), then evaluate f{x) for 
all X € Fp. If f{x) = 0, there is a single point on the curve, and otherwise either 
two points (if f{x) is a residue) or none. Additionally, we add a single point at 
infinity (recall that / has odd degree). A not-too-naiVe implementation computes 
the table of quadratic residues by squaring half the field elements, then uses d 
field multiplications and d field additions for each evaluation of f{x), where d is 
the degree of /. A better approach uses finite differences, requiring only d field 
additions (subtractions) to compute each f{x). 

Let f{x) = ^ fjX^ be a degree d polynomial over a commutative ring R. Fix 
a nonzero (5 G i? and define the linear operator A on R[x] by 



For any xq £ R[x], given /{xq), we may enumerate the values f{xo + nS) via 



{Af){x) = f{x + S)^f{x). 



(3) 



f{xo + {n + 1)S) = f{xo + nS) + Af{xo + n6). 



(4) 



To enumerate f{xo + nS) it suffices to enumerate A.f(xo + nS), which we also 
do via (4), replacing / with A/. Since A'*+^/ = 0, each step requires only d 
additions in R, starting from the initial values A.'^f{xo) for < fc < d. 

When R = ¥p, this process enumerates f{x) over the entire field and we 
simply set (5 = 1 and xq = 0. As subtraction modulo p is typically faster than 
addition, instead of (4) we use 



The necessary initial values are then (— 1)'"' A/(0). 

Algorithm 1 (Point Counting over Fp) Given a polynomial f (x) overWp of 
odd degree d and a vector M identifying nonzero quadratic residues in F^; 

1. Set tk ^ {-!)'' A'' f{0), for < k < d, and set iV ^ 1. 

2. For i from 1 to p: 

(a) If to = 0, set iV ^ + 1, and if M[to], set N ^ N + 2. 

(b) Set to to -ti, ti ^ ti - to, . . . , and td-i ^ td-i - ta- 

Output N. 

The computation tk — tk ~ tk+i is performed using integer subtraction, 
adding p if the result is negative. The map M is computed by enumerating the 
polynomial f{x) = for x from 1 to (p — l)/2 and setting M[f{x)] = 1. using 
a total of p subtractions (and no multiplications). 

The size of M may be cut in half by only storing residues less than p/2. One 
then uses M[min(<:o,P — ^o)], inverting M[p — to] when p = 3 mod 4. This slows 
down the algorithm, but is worth doing if Af exceeds the size of cache memory. 

It remains only to compute A'^/(0). We find that 



where the bracketed coefficient denotes a Stirling number of the second kind. 
The triangle of values Tj^k is represented by sequence A019538 in the OEIS [18]. 
Since (6) does not depend on p, it is computed just once for each k < d. 

In the process of enumerating f{x), we can also enumerate f{x) + g{x) with 
e + 1 additional field subtractions, where e is the degree of g{x). The case where 
g{x) is a small constant is particularly efficient, since nearby entries in M are 
used. The last two columns in Table 1 show the amortized cost per point of 
applying this approach to the curves y"^ = f{x), f{x) + l, . . . , /(x) + 31. 

4 Group Computations 

The performance of generic group algorithms is typically determined by two 
quantities: the time required to perform a group operation, and the number of 
operations performed. We briefly mention two techniques that reduce the former, 
then consider the latter in more detail. 



f{xo + {n + 1)S) = f{xo + nS) - (-A/)(.to + nS). 



(5) 
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Table 1. Point counting = fix) over Fp (CPU nanoseconds/point) 

The middle rows of Table 1 show the transition of M from L2 cache to general memory. 
The top section of the table is the most relevant for the algorithms considered here, as 
asymptotically superior methods are used for larger p. 

4.1 Faster Black Boxes 

The performance of the underlying finite field operations used to implement the 
group law on the Jacobian can be substantially improved using a Montgomery 
representation to perform arithmetic modulo p [14]. Another optimization due 
to Montgomery that is especially useful for the algorithms considered here is 
the simultaneous inversion of field elements (see [3, Alg. 11.15]).^ With an affine 
representation of the Jacobian each group operation requires a field inversion, 
but uses fewer multiplications than alternative representations. To ameliorate the 
high cost of field inversions, we then modify our algorithms to perform group 
operations "in parallel" . 

In the baby-steps giant-steps algorithm, for example, we fix a small constant 
n, compute n "babies" /3, 0^ , . . . , /3", then march them in parallel using steps 
of size n (the giant steps are handled similarly). In each parallel step we execute 
n group operations to the point where a field inverse is required, perform all the 
field inversions together for a cost of 3n — 3 multiplications and one inversion, 
then use the results to complete the group operations. Exponentiation can also 
benefit from parallelization, albeit to a lesser extent. 

These two optimizations are most effective when applied in combination, as 
may be seen in Table 2. 



^ This algorithm can be applied to any group. 
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Table 2. Black box performance (CPU nanoseconds/group operation) 

The heading xn indicates n group operations performed "in parallel". All times are 
for a single thread of execution. 

4.2 Generic Order Computations 

Our approach to computing ^J{C/¥q) = Lq{l) is based on a generic algorithm 
to compute the structure of an arbitrary abehan group [23]. We are aided both 
by absolute bounds on Lq{l) derived from the Weil conjectures (theorems), as 
well as predictions regarding its distribution within these bounds based on a 
generalized form of the Sato- Tate conjecture (proven for most genus 1 curves 
over Q in [7]). We first consider the general algorithm. 

We assume we have a black box for an abelian group G (written multiplica- 
tively) that can generate uniformly random group elements. For Jacobians, these 
can be obtained via decompression techniques [3, 14.1-2].^ We also suppose we 
are given bounds Mq and Mi such that Mq < |G| < Mi. 

The first (typically only) step is to compute the group exponent, A(G), the 
least common multiple of the orders of all the elements of G. This is accomplished 
by initially setting E ~ I, and for a random a €z G, computing the order of 
P = using a baby-steps giant-steps search on the interval [Mq/ E, Mi/ E]. 
We then update E ^ \P\E and repeat the process until either (1) there is only 
one multiple of E in the interval [Mo,Mi], or (2) we have generated c random 
elements, where c is a confidence parameter. In the former case we must have 
\G\ ~ E, and in the latter case E ~ A(G), with probability greater than 1 — 2^^'^ 
[23, Proposition 8.3]. For large Jacobians, (1) almost always applies, however for 
the relatively small groups considered here, (2) arises more often, particularly 
when g > 1. Fortunately, this does not present undue difficulty. 

Proposition 2. Given A(G) and Mq such that Mq < \G\ < 2Mo, the value of 
\G\ can be computed using 0{\G\^^^) group operations. 

^ This becomes costly when g > 2, where we use the simpler approach of [3, p. 307]. 



Proof (sketch). The bounds on |G| imply that it is enough to know the order of 
all but one of the p-Sylow subgroups of G (the p dividing \G\ are obtained from 
A(G)). Following Algorithm 9.1 of [23], we use A(G) to compute the order of 
each p-Sylow subgroup H Q G using 0(|iJ|^/^) group operations; however, we 
abandon the computation for any p-Sylow subgroup that proves to be larger than 
•^/[gJ. This can happen at most once, and the remaining successful computations 
uniquely determine |G| within the interval [Mq,2Mq). □ 

From the Weil interval (see (8) in section 4.4) we find that Mi < 2Mq for all 
q > 300 and 5 < 3. Proposition 2 implies that group structure computations will 
not impact the complexity of our task. Indeed, computing # J(G/Fg) is almost 
always dominated by the first computation of 

Given /3 e G and the knowledge that the interval [Mq, AIi] contains an integer 
M for which /3*^ = 1^, a baby-steps giant-steps search may be used to find such 
an M. This is not necessarily the order of (3, it is a multiple of it. We can then 
factor M and compute using O(lgM) group operations [23, Ch. 7]. The time 
to factor M is negligible in genus 2 and 3 (compared to the group computations), 
and in genus 1 we note that if a sieve is used to enumerate the primes up to 
A'^, the factorization of every M in the interval [Mq, Mi] can be obtained at 
essentially no additional cost, using 0{VN) bytes of memory. 

An alternative approach avoids the computation of from M by attempting 
to prove that M is the only multiple of |/?| in the interval. Write [Mo,Mi] as 
[C — R,C + R], and suppose the search to find M = G ± r has shown /3" 7^ 1g 
for all n in {C -r,G + r). If M is not the only multiple of |/3| in [G-R,G + R], 
then \f3\ is a divisor of M satisfying 2r < |/3| < R + r. In particular, if P is 
the largest prime factor of M and P > R + r and M/P < 2r, then M must 
be unique. When R ~ 0(A/^/^) this happens fairly often (about half the time). 
When it does not happen, one can avoid an 0(lgA/) order computation at the 
cost of 0{R^/'^) group operations by searching the remainder of the interval on 
the opposite side of M . This is only worthwhile when R is quite small, but can 
be helpful in genus 1.^ 

4.3 Optimized Baby-Steps Giant-Steps in the Jacobian - Part I 

The Mumford representation of J{C/¥q) uniquely represents a reduced divisor 
of the curve = f{^) by a pair of polynomials {u,v). The polynomial u is 
monic, with degree at most g, and divides — f [3, p. 307]. The inverse of (u, v) 
is simply (w, — w), which makes two facts immediate: 

1. The cost of group inversions is effectively zero. 

2. The element {u, v) has order 2 if and only ii v = and u divides /. 

Fact 1 allows us to apply the usual optimization for fast inverses [2, p. 250], 
reducing the number of group operations by a factor of a/2 (we no longer count 
inversions). Fact 2 gives us a bijection between the 2-torsion subgroup of J{G/¥q) 



* These ideas were sparked by a conversation with Mark Watkins, who also credits 
Geoff Bailey. 



and polynomials dividing / of degree at most g (exactly half the polynomials 
dividing /). If k counts the irreducible polynomials in the unique factorization 
of /, then the 2-rank of J{C/¥q) is fc - 1 and 2^-^ divides #J(C/F,).5 

When fc > 1, we start with E = 2'^^^ in our computation of \{G) above, 
reducing the number of group operations by a factor of 2'^'^^^'/^. Otherwise, we 
know ^J{C/¥q) is odd and can reduce the number of group operations by a 
factor of V2- The total expected benefit of fast inversions and knowledge of 2- 
rank is at least a factor of 2.10 in genus 1. 2.31 in genus 2, and 2.48 in genus 3. 



4.4 Optimized Baby-Steps Giant-Steps in the Jacobian - Part II 

We come now to the most interesting class of optimizations, those based on the 
distribution of #J(C/Fg). The Riemann hypothesis for curves (proven by Weil) 
states that LqlT) has roots lying on a circle of radius g"^/^ about the origin of 
the complex plane. As Lq{T) is a real polynomial of even degree with iq(0) = 1, 
these roots may be grouped into conjugate pairs. 

Definition 3. A unitary symplectic polynomial p(z) is a real polynomial of even 
degree with roots ai, ...ag,ai, ...oig all on the unit circle. 

The unitary symplectic polynomials are precisely those arising as the char- 
acteristic polynomial of a unitary symplectic matrix. The Riemann hypothesis 
for curves implies that p{z) — Lg{zq~^/^) is a unitary symplectic polynomial. 
The coefficients of p{z) = ^ ajz^ may be bounded by 

Kl <('/)■ (7) 

The corresponding bounds on the coefRcients of Lq{T) constrain the value of 
Lq{l) = #J{C/¥q), yielding the Weil interval 

(^/9 - 1)'^ < # J(C/F,) <{^+ 1)2^. (8) 

For the Uj with j odd, the well known bounds in (7) are tight, however for even 
J they are not. We are particularly interested in the coefficient a2- 

Proposition 4. Letp{z) = '^ajZ^ be a unitary symplectic polynomial of degree 
2g. For fixed ai, 02 is bounded by an interval of radius at most g. In fact 

a2<5+(^)a?; (9) 
a2 > -.g + 2 + [af - S^) 12. (10) 

The value 5 < 2 is the distance from oi to the nearest integer congruent to 
mod 4 (when g is odd), or 2 mod 4 (when g is even). 



^ Computing k requires only a distinct-degree factorization of /, see [2, Alg. 3.4.3]. 



Proof. Define (3j = aj + aj for 1 < j < g, where the aj are the roots of p(z). 
Then ai ~ J2f^j ^^'^ 0-2 = g + {a{ — t2)/2, where = J2(^'j- For fixed ai, t2 is 
minimized by Pj = ai/g, yielding (9), and <2 is maximized by f3j = ±2 for j < g 
and Pg = 5, yielding (10) (note that \ < 2). The proposition follows. □ 

We have as a corollary, independent of oi, the bound a2 > — <?, and for g 
odd, 02 > 2 — g. In genus 2, the proposition reduces to Lemma 1 of [13], however 
we are especially interested in the genus 3 case, where our estimate of 02 will 
determine the leading constant factor in the time to compute #J(C/Fg). In 
genus 3, Proposition 4 constrains 02 to an interval of radius 3 once ai is known, 
whereas (7) would give a radius of 15. 

Having bounded the interval as tightly as possible, we consider the search 
within. We suppose we are seeking the value of a random variable X with some 
distribution over [Mo, Mi]. We assume that we start from an initial estimate 
M and search outward in both directions using a standard baby-steps giant- 
steps search with all baby steps taken first (see [20] for a more general analysis) . 
Ignoring the boundaries, the cost of the search is 



group operations. As our cost function is linear in \X — M\, we minimize the 
mean absolute error in our estimate by setting M to the median value of X and 
s = V2E, where E is the expectation of |X — Af|. This holds for any distribution 
on X, we simply need the median value of X and its expected distance from it. 

If wc consider p{z) — Lq{zq~^/'^) as a "random" unitary symplectic poly- 
nomial, a natural distribution for p{z) can be derived from the Haar measure 
on the compact Lie group U Sp{2g) (the group of 2g x 2g matrices over C that 
are both unitary and symplectic). Each p{z) corresponds to a conjugacy class 
of matrices with p{z) as their characteristic polynomial. Let the eigenvalues of 
a random matrix in USp{2g) be e^'^S e^^^n, with 9j G [0, tt). The joint 
probability density function on the 9j given by the Haar measure on USp{2g) is 



This distribution is derived from the Weyl integration formula [26, p. 218] and 
can be found in [11, p. 107]. For g = 1, this simplifies to {2/it)sit? OdO, which 
corresponds to the Sato- Tate distribution. We may apply (12) to compute various 
statistical properties of random unitary symplectic polynomials. The coefficient 
a\ is simply the negative sum of the eigenvalues. 



and we find that the median (and expectation) of ai is 0. In genus 1, the expected 
distance of oi from its median is 



s + 2\X - M\/s 



(11) 




(12) 



9 




(13) 




(14) 



The value 8/(37r) « 0.8488 is not much smaher than 1, which corresponds to 
a uniform distribution, so the potential benefit is small in genus 1. In genus 2, 
however, the expected distance of ai from its median is 4096/(6257r^) « 0.7905, 
versus an expected distance of 2 for the uniform distribution. The corresponding 
values for genus 3 are « 0.7985 and 3. 

Given the value of ai wc can take this approach further, computing the 
median and expected distance for 02 conditioned on ai. Applying (12), we pre- 
compute a table of median and expected distance values for 02 for various ranges 
of ai. In genus 3, we find that the largest expected distance for 02 given ai is 
about 0.66, much smaller than the value 7.5 for a uniform distribution of 02 over 
the interval given by (7). 

Of course such optimizations are effective only when the polynomials Lp{T) 
for a particular curve and relatively small values of p actually correspond to 
(apparently) random unitary symplectic polynomials. For g > 1, it is not known 
whether this occurs at all, even as p — > 00.^ In genus 1, while the Sato- Tate 
conjecture is now largely proven over Q [7], the convergence rate remains the 
subject of conjecture. Indeed, the investigation of such questions was one moti- 
vation for undertaking these computations. It is only natural to ask whether our 
assumptions arc met. 




Histogram of actual 02 values Distribution of a2 given by (12) 



The figure on the left is a histogram of a2 coefficient values obtained by com- 
puting Lp(r) for p < 2^4 for an arbitrarily chosen genus 3 curve (see Table 6). 
The figure on the right is the distribution of 02 predicted by the Haar measure 
on USp{2g), obtained by numerically integrating 

02 = .9 + JJ^ 4 cos dj cos 9k (15) 

]<k 

over the distribution in (12). The dotted lines show the height of the uniform 
distribution. Similarly matching graphs are found for the other coefficients. 

This remarkable degree of convergence is typical for a randomly chosen curve. 
We should note, however, that the generalized form of the Sato- Tate conjecture 
considered here applies only to curves whose Jacobian over Q has a trivial en- 
domorphism ring (isomorphic to Z), so there are exceptional cases. In genus 1 
these are curves with complex multiplication. In higher genera, other exceptional 
cases occur, such as the genus 2 QM-curves considered in [10]. 



Results are known for certain universal families of curves, e.g. [11, Thm. 10.8.2]. 



5 Results 



To compare different methods for computing Lp{T) and to assess tlie feasible 
range of L-series computations, we conducted extensive performance tests. Our 
test platform consisted of eight networked PCs, each equipped with a 2.5GHz 
AMD Athlon processor running a 64-bit Linux operating system. The point- 
counting and generic group algorithms were implemented using the techniques 
described in this paper, and we incorporated David Harvey's source code for 
the p-adic computations (the algorithm of [9], including recent improvements 
described in [8]). All code was compiled with the GNU C/C++ compiler using 
the options "-02 -m64 -mtune=k8" [19]. 

In genus 1 there are several existing implementations of the computation 
contemplated here: given an elliptic curve defined over Q, determine the coef- 
ficient oi of Lp{T) = pT^ + aiT + 1 for all p < N. We were able to compare 
our implementation with two software packages specifically optimized for this 
purpose: Magma [1], and the PARI hbrary [24] as incorporated in SAGE [21]. 
The range of N we could use in this comparison was necessarily limited; results 
for larger N may be found in Table 5. 



N 


PARI 


Magma 


smalljac 


2i6 


0.26 


0.29 


0.07 




0.55 


0.59 


0.15 


2i8 


1.17 


1.24 


0.30 


2i9 


2.51 


2.53 


0.62 


220 


5.46 


5.26 


1.29 


221 


11.67 


11.09 


2.65 


222 


25.46 


23.31 


5.53 


223 


55.50 


49.22 


11.56 


224 


123.02 


104.50 


24.31 


225 


266.40 


222.56 


51.60 


226 


598.16 


476.74 


110.29 


227 


1367.46 


1017.55 


233.94 


228 


3152.91 


2159.87 


498.46 


229 


7317.01 


4646.24 


1065.28 


230 


17167.29 


10141.28 


2292.74 



Table 3. L-series computations in genus 1 (CPU seconds) 

Each row lists CPU times for a single thread of execution to compute the coefRcient ai 
of Lp{T) for all p < iV, using the elliptic curve y'^ ^ + 314159.T + 271828. In SAGE, 
the function aplist(A'^) performs this computation via the PARI function e//ap(N). 
The corresponding function in Magma is Traces OfFrobenius{N). The column labeled 
"smalljac" list times for our implementation. 



Before undertaking similar computations in genus 2 and 3, we first deter- 
mined the appropriate algorithm to use for various ranges of p using Table 4. 



Each row gives timings for the algorithms considered here, averaged over a smah 
sample of primes of similar size. 





Genus 2 - Lp{T) 




Genus 3 - Lp{T) 


Genus 3 - ai 


pts/grp 


group p- 


adic 


pts/grp p-adic/grp 


points 


2l4 


0.22 


0.55 


4 


10 15 


0.12 


2l5 


0.34 


0.88 


6 


21 23 


0.23 


2i6 


0.56 


1.33 


8 


43 31 


0.45 




0.98 


2.21 


11 


82 40 


0.89 


2l8 


1.82 


3.42 


17 


51 


1.78 


2i9 


3.44 


5.87 


27 


67 


3.57 


220 


7.98 


10.1 


40 


97 


8.48 


221 


18.9 


17.9 


66 


148 


19.7 


222 


52 


35 


104 


212 


56 


223 




54 


176 


355 


123 


224 




104 


288 


577 


738 


225 




173 


494 


995 


1870 


226 




306 


871 


1753 


4550 


227 




505 


1532 


3070 


9800 



Table 4. Lp{T) computations (CPU milliseconds) 

Random curves of the appropriate genus were generated with coefficients uniformly 
distributed over [1,2*). The polynomial Lp{T) was then computed for 100 primes 
Ri 2*, with the average CPU time listed. Columns labeled "pts/grp" compute ai by 
point counting over Fp, followed by a group computation to obtain Lp{T). The column 
"p-adic/grp" computes Lp{T) mod p, then applies a group computation to get Lp{T). 
The rightmost column computes just the coefficient ai, via point counting over Fp. 

The task of computing i-scries coefficients is well-suited to parallel computa- 
tion. We implemented a simple distributed program which partitions the range 
[1, N] into subintcrvals Ii, I2, • . • , Im-, distributes the task of computing Lp{T) 
for p & Im to n CPUs on a network, then collects and collates the results. This 
is useful even on a single computer whose microprocessor may have two or more 
cores. On our 8 node test platform we had 16 CPUs available for computation. 
Tables 5 and 6 lists elapsed times for L-series computations in single and 8-node 
configurations. 

For practical reasons, wc limited the duration of any single test. Larger com- 
putations could be undertaken with additional time and/or computing resources, 
without requiring software modifications. As they stand, the results extend to 
values of N substantially larger than any we could find in the literature. 

Source code for the software can be freely obtained under a GNU General 
Public License (GPL) and is expected to be incorporated into SAGE. It is a 
pleasure to thank William Stein for access to the SAGE computational resources 
at the University of Washington, and especially David Harvey for providing the 
code used for the p-adic computations. 



Genus 1 Genus 1 



iV 


Xl 


x8 


N 


xl 


x8 


221 


1.5 


0.5 


230 


20:43 


2:41 


222 


3.1 


0.7 


231 


45:13 


5:52 


223 


6.3 


1.1 


232 


1:45:45 


13:12 


224 


13.3 


2.0 


233 


4:24:50 


32:51 


225 


28.2 


4.2 


234 


10:16:11 


1:16:18 


226 


59.2 


8.1 


235 


23:15:58 


2:52:47 


227 


126.2 


16.6 


236 




6:29:46 


228 


271.3 


35.1 


237 




14:44:33 


229 


578.0 


74.5 


238 




33:11:08 



Table 5. L-series computations in genus 1 (elapsed times) 

For the elliptic curve = + 314159a; + 271828, the coefficients of Lp{T) were 
computed for all p < N. Columns labeled xn list total elapsed times (seconds or 
hh:mm:ss) for a computation performed on n nodes (two cores per node), including 
communication overhead and time spent collating responses. 



N 


Genus 2 


Genus 3 


Genus 3 - 


- Hi only 


xl 


x8 


xl 


x8 


xl 


x8 


2i6 


1 


< 1 


43 


13 


1 


< 1 


2" 


4 


2 


1:49 


18 


5 


1 


2i8 


12 


3 


4:42 


41 


11 


2 


2i9 


40 


7 


12:43 


1:47 


41 


6 


220 


2:32 


24 


36:14 


4:52 


2:41 


21 


221 


10:46 


1:38 


1:45:36 


13:40 


11:33 


1:27 


222 


40:20 


5:38 


5:23:31 


41:07 


53:26 


6:38 


223 


2:23:56 


19:04 


16:38:11 


2:05:40 


4:33:26 


33:00 


224 


8:00:09 


1:16:47 




6:28:25 


38:51:07 


4:42:43 


225 


26:51:27 


3:24:40 




20:35:16 




20:35:16 


226 




11:07:28 










227 




36:48:52 











Table 6. L-series computations in genus 2 and 3 (elapsed times) 
The coefficients of Lp{T) were computed for the genus 2 and 3 hyperelliptic curves 
y"^ = x^ + 31419a:^ + 2718282;^ + 1644934a: + 57721566; 

j/^ = + 314159x^ + 2718282;'' + 1644934a;^ + 57721566a:^ + 1618034a; + 141421, 

for all p < A'^ where the curves had good reduction. Columns labeled xn list total 
elapsed wall times (hh:mm:ss) for a computation performed on n nodes, including all 
overhead. The last two columns give times to compute just the coefficient oi . 
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